Because of the vast volume of data being produced by today's scientific simulations, lossy compression allowing user-controlled information loss can significantly reduce the data size and the I/O burden. However, for large-scale cosmology simulations, such as the Hardware/Hybrid Accelerated Cosmology Code (HACC), where memory overhead constraints restrict compression to only one snapshot at a time, the lossy compression ratio is extremely limited because of the fairly low spatial coherence and high irregularity of the data. In this work, we propose a pattern-matching (similarity searching) technique to optimize the prediction accuracy and compression ratio of the SZ lossy compressor on the HACC data sets. We evaluate our proposed method with different configurations and compare it with state-of-the-art lossy compressors. Experiments show that our proposed optimization approach can improve the prediction accuracy and reduce the compressed size of quantization codes compared with SZ. We present several lessons useful for future research involving pattern-matching techniques for lossy compression.